
42 research outputs found

Predicting genome-wide redundancy using machine learning

Author: Bandyopadhyay Sunayan
Birnbaum Kenneth D
Chen Huang-Wen
Shasha Dennis E
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Gene duplication can lead to genetic redundancy, which masks the function of mutated genes in genetic analyses. Methods to increase sensitivity in identifying genetic redundancy can improve the efficiency of reverse genetics and lend insights into the evolutionary outcomes of gene duplication. Machine learning techniques are well suited to classifying gene family members into redundant and non-redundant gene pairs in model species where sufficient genetic and genomic data is available, such as Arabidopsis thaliana, the test case used here. Results: Machine learning techniques that combine multiple attributes led to a dramatic improvement in predicting genetic redundancy over single-trait classifiers alone, such as BLAST E-values or expression correlation. In withholding analysis, one of the methods used here, Support Vector Machines, was two-fold more precise than single-attribute classifiers, reaching a level where the majority of redundant calls were correctly labeled. Using this higher confidence in identifying redundancy, machine learning predicts that about half of all genes in Arabidopsis showed the signature of predicted redundancy with at least one but typically fewer than three other family members. Interestingly, a large proportion of predicted redundant gene pairs were relatively old duplications (e.g., Ks > 1), suggesting that redundancy is stable over long evolutionary periods. Conclusions: Machine learning predicts that most genes will have a functionally redundant paralog but will exhibit redundancy with relatively few genes within a family. The predictions and gene pair attributes for Arabidopsis provide a new resource for research in genetics and genome evolution. These techniques can now be applied to other organisms.
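The abstract compares classifiers by precision, that is, the fraction of gene pairs called redundant that really are redundant. A minimal sketch of that measure follows; the function name and the counts are hypothetical illustrations and are not taken from the paper.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of positive (redundant) calls that are correct."""
    calls = true_positives + false_positives
    if calls == 0:
        raise ValueError("precision is undefined when no positive calls are made")
    return true_positives / calls


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate a two-fold gain in precision.
    single_attribute = precision(true_positives=30, false_positives=70)
    multi_attribute = precision(true_positives=60, false_positives=40)
    print(single_attribute, multi_attribute)
    # A precision above 0.5 means most redundant calls are correct.
    print(multi_attribute > 0.5)
```

With these made-up counts the multi-attribute classifier is twice as precise as the single-attribute one and crosses the 0.5 mark, which is the situation the abstract describes in words.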

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Predictive network modeling of the high-resolution dynamic plant transcriptome in response to nitrate

Author: Coruzzi Gloria M
Krouk Gabriel
LeCun Yann
Mirowski Piotr
Shasha Dennis E
Publication venue: BioMed Central
Publication date: 01/01/2010

BACKGROUND: Nitrate, acting as both a nitrogen source and a signaling molecule, controls many aspects of plant development. However, gene networks involved in plant adaptation to fluctuating nitrate environments have not yet been identified. RESULTS: Here we use time-series transcriptome data to decipher gene relationships and consequently to build core regulatory networks involved in Arabidopsis root adaptation to nitrate provision. The experimental approach has been to monitor genome-wide responses to nitrate at 3, 6, 9, 12, 15 and 20 minutes, using Affymetrix ATH1 gene chips. This high-resolution time course analysis demonstrated that the previously known primary nitrate response is actually preceded by a very fast gene expression modulation, involving genes and functions needed to prepare plants to use or reduce nitrate. A state-space model inferred from this microarray time-series data successfully predicts gene behavior in unlearnt conditions. CONCLUSIONS: The experiments and methods allow us to propose a temporal working model for nitrate-driven gene networks. This network model is tested both in silico and experimentally. For example, the over-expression of a predicted gene hub encoding a transcription factor induced early in the cascade indeed leads to the modification of the kinetic nitrate response of sentinel genes such as NIR, NIA2, and NRT1.1, and several other transcription factors. The potential nitrate/hormone connections implicated by this time-series data are also evaluated.

Crossref

Springer - Publisher Connector

PubMed Central

HAL Descartes

Qualitative network models and genome-wide expression data define carbon/nitrogen-responsive molecular machines in Arabidopsis

Author: Chiaromonte Francesca
Coruzzi Gloria M
Dean Alexis
Gutiérrez Rodrigo A
Lejay Laurence V
Shasha Dennis E
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: Carbon (C) and nitrogen (N) metabolites can regulate gene expression in Arabidopsis thaliana. Here, we use multinetwork analysis of microarray data to identify molecular networks regulated by C and N in the Arabidopsis root system. RESULTS: We used the Arabidopsis whole genome Affymetrix gene chip to explore global gene expression responses in plants exposed transiently to a matrix of C and N treatments. We used ANOVA to define quantitative models of regulation for all detected genes. Our results suggest that about half of the Arabidopsis transcriptome is regulated by C, N or CN interactions. We found ample evidence for interactions between C and N that include genes involved in metabolic pathways, protein degradation and auxin signaling. To provide a global, yet detailed, view of how the cell molecular network is adjusted in response to the CN treatments, we constructed a qualitative multinetwork model of the Arabidopsis metabolic and regulatory molecular network, including 6,176 genes, 1,459 metabolites and 230,900 interactions among them. We integrated the quantitative models of CN gene regulation with the wiring diagram in the multinetwork, and identified specific interacting genes in biological modules that respond to C, N or CN treatments. CONCLUSION: Our results indicate that CN regulation occurs at multiple levels, including potential post-transcriptional control by microRNAs. The network analysis of our systematic dataset of CN treatments indicates that CN sensing is a mechanism that coordinates the global regulation of specific sets of molecular machines in the plant cell.

Springer - Publisher Connector

PubMed Central

An integrated genetic, genomic and systems approach defines gene networks regulated by the interaction of light and carbon signaling pathways in Arabidopsis

Author: Coruzzi Gloria M
Gutiérrez Rodrigo A
Katari Manpreet S
Mukherjee Indrani
Nero Damion
Shasha Dennis
Shin Michael J
Thum Karen E
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Light and carbon are two important interacting signals affecting plant growth and development. The mechanism(s) and/or genes involved in sensing and/or mediating the signaling pathways involving these interactions are unknown. This study integrates genetic, genomic and systems approaches to identify a genetically perturbed gene network that is regulated by the interaction of carbon and light signaling in Arabidopsis. Results: Carbon and light insensitive (cli) mutants were isolated. Microarray data from cli186 is analyzed to identify the genes, biological processes and gene networks affected by the integration of light and carbon pathways. Analysis of this data reveals 966 genes regulated by light and/or carbon signaling in wild-type. In cli186, 216 of these light/carbon regulated genes are misregulated in response to light and/or carbon treatments, where 78% are misregulated in response to light and carbon interactions. Analysis of the gene lists shows that genes in the biological processes "energy" and "metabolism" are over-represented among the 966 genes regulated by carbon and/or light in wild-type, and the 216 misregulated genes in cli186. To understand connections among carbon and/or light regulated genes in wild-type and the misregulated genes in cli186, the microarray data is interpreted in the context of metabolic and regulatory networks. The network created from the 966 light/carbon regulated genes in wild-type reveals that cli186 is affected in the light and/or carbon regulation of a network of 60 connected genes, including six transcription factors. One transcription factor, HAT22, appears to be a regulatory "hub" in the cli186 network as it shows regulatory connections linking a metabolic network of genes involved in "amino acid metabolism", "C-compound/carbohydrate metabolism" and "glycolysis/gluconeogenesis". Conclusion: The global misregulation of gene networks controlled by light and carbon signaling in cli186 indicates that it represents one of the first Arabidopsis mutants isolated that is specifically disrupted in the integration of both carbon and light signals to control the regulation of metabolic, developmental and regulatory genes. The network analysis of misregulated genes suggests that CLI186 acts to integrate light and carbon signaling interactions and is a master regulator connecting the regulation of a host of downstream metabolic and regulatory processes.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Messiah College: MOSAIC (Messiah's Open Scholarship And Intellectual Creativity)

GraphFind: enhancing graph searching by low support data mining techniques

Author: A Ferro
Alfredo Ferro
Alfredo Pulvirenti
BT Messmer
D Shasha
Dennis Shasha
DJ Cook
Dmitry Skripin
E Cohen
J Cheng
L Cordella
Misael Mongiovì
P Foggia
R Giugno
R Sharan
Rosalba Giugno
S Kumar
S Zhang
X Yan
Y Tian
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Biomedical and chemical databases are large and rapidly growing in size. Graphs naturally model such kinds of data. To fully exploit the wealth of information in these graph databases, a key role is played by systems that search for all exact or approximate occurrences of a query graph. To deal efficiently with graph searching, advanced methods for indexing, representation and matching of graphs have been proposed. Results: This paper presents GraphFind. The system implements efficient graph searching algorithms together with advanced filtering techniques that allow approximate search. It allows users to select candidate subgraphs rather than entire graphs. It implements an effective data storage based also on low-support data mining. Conclusions: GraphFind is compared with Frowns, GraphGrep and gIndex. Experiments show that GraphFind outperforms the compared systems on a very large collection of small graphs. The proposed low-support mining technique, which applies to any searching system, also allows a significant index space reduction.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Catalogo dei prodotti della ricerca

Cell-by-cell dissection of phloem development links a maturation gradient to cell specialization

Publisher Copyright: Copyright © 2021 The Authors, some rights reserved.

In the plant meristem, tissue-wide maturation gradients are coordinated with specialized cell networks to establish various developmental phases required for indeterminate growth. Here, we used single-cell transcriptomics to reconstruct the protophloem developmental trajectory from the birth of cell progenitors to terminal differentiation in the Arabidopsis thaliana root. PHLOEM EARLY DNA-BINDING-WITH-ONE-FINGER (PEAR) transcription factors mediate lineage bifurcation by activating guanosine triphosphatase signaling and prime a transcriptional differentiation program. This program is initially repressed by a meristem-wide gradient of PLETHORA transcription factors. Only the dissipation of the PLETHORA gradient permits activation of the differentiation program that involves mutual inhibition of early versus late meristem regulators. Thus, for phloem development, broad maturation gradients interface with cell-type-specific transcriptional regulators to stage cellular differentiation.

Online Research @ Cardiff

Ghent University Academic Bibliography

PubMed Central

Helsingin yliopiston digitaalinen arkisto

Apollo (Cambridge)

Jagiellonian University Repository

Rational Design of Temperature-Sensitive Alleles Using Computational Structure Prediction

Author: B Cunningham
B Lee
C Cortes
Ca Rohl
Christopher S. Poultney
CJ Burges
David Gresham
Dennis E. Shasha
EH Kellogg
G Chakshusmathi
Glenn L. Butterfoss
HM Muller
JM Word
JR Quinlan
K Bajaj
K Drew
KD Pruitt
Kevin Drew
Kristin C. Gunsalus
M Hall
Michelle R. Gutwein
N Eswar
N Siew
R Varadarajan
Richard Bonneau
RJ Dohmen
S Tweedie
SF Altschul
TW Harris
Vladimir N. Uversky
WS Noble
WS Sandberg
Publication venue: Public Library of Science
Publication date: 02/09/2011

Temperature-sensitive (ts) mutations are mutations that exhibit a mutant phenotype at high or low temperatures and a wild-type phenotype at normal temperature. Temperature-sensitive mutants are valuable tools for geneticists, particularly in the study of essential genes. However, finding ts mutations typically relies on generating and screening many thousands of mutations, which is an expensive and labor-intensive process. Here we describe an in silico method that uses Rosetta and machine learning techniques to predict a highly accurate “top 5” list of ts mutations given the structure of a protein of interest. Rosetta is a protein structure prediction and design code, used here to model and score how proteins accommodate point mutations with side-chain and backbone movements. We show that integrating Rosetta relax-derived features with sequence-based features results in accurate temperature-sensitive mutation predictions.

Public Library of Science (PLOS)

Crossref

PubMed Central

A Systems Approach Uncovers Restrictions for Signal Interactions Regulating Genome-wide Responses to Nutritional Cues in Arabidopsis

Author: A Camargo
A Castillon
A Maruyama-Nakashita
A Nieder
Alexis A. Cruikshank
AN Stepanova
B Moore
CS Poultney
Daniel Tranchina
Dennis Shasha
DY Little
E Baena-Gonzalez
EA Vidal
F Rolland
FQ Guo
G Coruzzi
G Fechner
G Krouk
Gabriel Krouk
Gloria M. Coruzzi
H Chen
J Price
JE Malamy
JL Nemhauser
KE Thum
L Lejay
Laurence Lejay
M Mizunami
ML Gifford
P Achard
P Omur-Ozbek
PM Palenchar
RA Gutierrez
Rama Ranganathan
Rodrigo A. Gutiérrez
S Gazzarrini
S Muños
T Speed
T Takahashi
Y Tao
Publication venue: Public Library of Science
Publication date: 01/01/2009

As sessile organisms, plants must cope with multiple and combined variations of signals in their environment. However, very few reports have studied the genome-wide effects of systematic signal combinations on gene expression. Here, we evaluate a high level of signal integration, by modeling genome-wide expression patterns under a factorial combination of carbon (C), light (L), and nitrogen (N) as binary factors in two organs (O), roots and leaves. Signal management differs among C, N, and L and between shoots and roots. For example, L is the major factor controlling gene expression in leaves. However, in roots there is no obvious prominent signal, and signal interaction is stronger. The major signal interaction events detected genome wide in Arabidopsis roots are deciphered and summarized in a comprehensive conceptual model. Surprisingly, global analysis of gene expression in response to C, N, L, and O revealed that the number of genes controlled by a signal is proportional to the magnitude of the gene expression changes elicited by the signal. These results uncovered a strong constraining structure in plant cell signaling pathways, which prompted us to propose the existence of a “code” of signal integration.
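A factorial combination of three binary factors (C, L, N) in two organs gives 2 x 2 x 2 x 2 = 16 experimental conditions. The sketch below simply enumerates them; the 0/1 coding, the organ labels, and the function name are illustrative assumptions, not the authors' encoding.

```python
from itertools import product

# Factor names follow the abstract; the 0/1 levels and the organ labels
# are assumptions made for illustration only.
FACTORS = ("C", "L", "N")
ORGANS = ("root", "leaf")


def factorial_conditions():
    """Return every combination of the binary factors in each organ."""
    conditions = []
    for organ in ORGANS:
        for levels in product((0, 1), repeat=len(FACTORS)):
            condition = dict(zip(FACTORS, levels))
            condition["O"] = organ
            conditions.append(condition)
    return conditions


if __name__ == "__main__":
    for condition in factorial_conditions():
        print(condition)
```

Each dictionary stands for one treatment and organ combination, which is the unit over which a factorial model of expression would be fitted.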

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

HAL Descartes

An expanded evaluation of protein function prediction methods shows an improvement in accuracy

Author: Almeida-e-Silva Danillo C.
Altenhoff Adrian
Babbitt Patricia C.
Bankapur Asma R.
Bargsten Joachim W.
Ben-Hur Asa
Benso Alfredo
Bhat Prajwal
Bkc Dukka
Bonneau Richard
Brenner Steven E.
Bryson Kevin
Cao Renzhi
Casadio Rita
Cejuela Juan M.
Chapman Samuel
Chen Ching-Tai
Cheng Jianlin
Cibrian-Uhalte Elena
Clark Wyatt T.
Cozzetto Domenico
D'Andrea Daniel
Das Sayoni
Dawson Natalie L.
del Pozo Angela
Denny Paul
Dessimoz Christophe
Di Carlo Stefano
Dogan Tunca
ElShal Sarah
Falda Marco
Fang Hai
Feng Shou
Fernández José M.
Ferrari Carlo
Fontana Paolo
Foulger Rebecca E.
Friedberg Iddo
Funk Christopher S.
Gabaldon Toni
Gemovic Branislava
Gillis Jesse
Ginter Filip
Giollo Manuel
Glisic Sanja
Goldberg Tatyana
Gong Qingtian
Gough Julian
Greene Casey S.
Hakala Kai
Hamp Tobias
Hieta Reija
Holm Liisa
Hsu Wen-Lian
Huntley Rachael P.
Jiang Yuxiang
Jones David T.
Kaewphan Suwisa
Kahanda Indika
Kansakar Lakesh
Khan Ishita K.
Kihara Daisuke
Koo Da Chen Emily
Koskinen Patrik
Lavezzo Enrico
Lee David
Lees Jonathan G.
Legge Duncan
Lepore Rosalba
Li Biao
Lin Alexandra
Linial Michal
Lovering Ruth C.
Magrane Michele
Maietta Paolo
Marcet-Houben Marina
Martelli Pier Luigi
Martin Maria J.
Mehryary Farrokh
Melidoni Anna N.
Mesiti Marco
Minneci Federico
Mooney Sean D.
Moreau Yves
Mutowo-Meullenet Prudence
Nepusz Tamás
Ning Wei
O'Donovan Claire
Oates Matt
Ofer Dan
Orengo Christine A.
Oron Tal Ronnen
Paccanaro Alberto
Pavlidis Paul
Penfold-Brown Duncan
Perovic Vladmir
Pichler Klemens
Piovesan Damiano
Politano Gianfranco
Profiti Giuseppe
Radivojac Predrag
Rappoport Nadav
Re Matteo
Rehman Hafeez Ur
Richter Lothar
Robinson Peter N.
Romero Alfonso E.
Rost Burkhard
Sahraeian Sayed M.E.
Salakoski Tapio
Salamov Asaf
Sasidharan Rajkumar
Savino Alessandro
Sedeño-Cortés Adriana E.
Sharan Malvika
Shasha Dennis
Shypitsyna Aleksandra
Sillitoe Ian
Skunca Nives
Smithers Ben
Stern Amos
Sternberg Michael J.E.
Supek Fran
Tian Weidong
Toppo Stefano
Tosatto Silvio C.E.
Tramontano Anna
Tranchevent Léon-Charles
Tress Michael L.
Törönen Petri
Valencia Alfonso
Valentini Giorgio
van Dijk Aalt D.J.
Veljkovic Nevena
Veljkovic Veljko
Vencio Ricardo ZN
Verspoor Karin M.
Vogel Jörg
Vucetic Slobodan
Wang Zheng
Wass Mark N.
Yang Haixuan
Youngs Noah
Zakeri Pooya
Zhang Shanshan
Zhong Zhaolong
Zhou Yuanpeng
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. Results: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. Conclusions: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent. Keywords: Protein function prediction, Disease gene prioritization.

Brage HiM

An Expanded Evaluation of Protein Function Prediction Methods Shows an Improvement In Accuracy

Author: Almeida-e-Silva Danillo C.
Altenhoff Adrian
Babbitt Patricia C.
Bankapur Asma R.
Bargsten Joachim W.
Ben-Hur Asa
Benso Alfredo
Bhat Prajwal
BKC Dukka
Bonneau Richard
Brenner Steven E.
Bryson Kevin
Cao Renzhi
Casadio Rita
Cejuela Juan M.
Chapman Samuel
Chen Ching-Tai
Cheng Jianlin
Cibrian-Uhalte Elena
Clark Wyatt T.
Cozzetto Domenico
D'Andrea Daniel
Das Sayoni
Dawson Natalie L.
del Pozo Angela
Denny Paul
Dessimoz Christophe
Di Carlo Stefano
Dogan Tunca
ElShal Sarah
Falda Marco
Fang Hai
Feng Shou
Fernández José M.
Ferrari Carlo
Fontana Paolo
Foulger Rebecca E.
Friedberg Iddo
Funk Christopher S.
Gabaldon Toni
Gemovic Branislava
Gillis Jesse
Ginter Filip
Giollo Manuel
Glisic Sanja
Goldberg Tatyana
Gong Qingtian
Gough Julian
Greene Casey S.
Hakala Kai
Hamp Tobias
Hieta Reija
Holm Liisa
Hsu Wen-Lian
Huntley Rachael P.
Jiang Yuxiang
Jones David T.
Kaewphan Suwisa
Kahanda Indika
Kansakar Lakesh
Khan Ishita K.
Kihara Daisuke
Koo Da Chen Emily
Koskinen Patrik
Lavezzo Enrico
Lee David
Lees Jonathan G.
Legge Duncan
Lepore Rosalba
Li Biao
Lin Alexandra
Linial Michal
Lovering Ruth C.
Magrane Michele
Maietta Paolo
Marcet-Houben Marina
Martelli Pier Luigi
Martin Maria J.
Mehryary Farrokh
Melidoni Anna N.
Mesiti Marco
Minneci Federico
Mooney Sean D.
Moreau Yves
Mutowo-Meullenet Prudence
Nepusz Tamás
Ning Wei
O'Donovan Claire
Oates Matt
Ofer Dan
Orengo Christine A.
Oron Tal Ronnen
Paccanaro Alberto
Pavlidis Paul
Penfold-Brown Duncan
Perovic Vladmir
Pichler Klemens
Piovesan Damiano
Politano Gianfranco
Profiti Giuseppe
Radivojac Predrag
Rappoport Nadav
Re Matteo
Rehman Hafeez Ur
Richter Lothar
Robinson Peter N.
Romero Alfonso E.
Rost Burkhard
Sahraeian Sayed M.E.
Salakoski Tapio
Salamov Asaf
Sasidharan Rajkumar
Savino Alessandro
Sedeño-Cortés Adriana E.
Sharan Malvika
Shasha Dennis
Shypitsyna Aleksandra
Skunca Nives
Smithers Ben
Stern Amos
Sternberg Michael J.E.
Sillitoe Ian
Supek Fran
Tian Weidong
Toppo Stefano
Tosatto Silvio C.E.
Tramontano Anna
Tranchevent Léon-Charles
Tress Michael L.
Törönen Petri
Valencia Alfonso
Valentini Giorgio
van Dijk Aalt D.J.
Veljkovic Nevena
Veljkovic Veljko
Vencio Ricardo Z.N.
Verspoor Karin M.
Vogel Jörg
Vucetic Slobodan
Wang Zheng
Wass Mark N.
Yang Haixuan
Youngs Noah
Zakeri Pooya
Zhang Shanshan
Zhong Zhaolong
Zhou Yuanpeng
Publication venue: The Aquila Digital Community
Publication date: 07/09/2016

Aquila Digital Community